
Conversation

@ZX-ModelCloud

No description provided.

@ZX-ModelCloud changed the title from "add meta info" to "Add Meta" on Dec 4, 2024
@ZX-ModelCloud changed the title from "Add Meta" to "Add quantize_config.meta property" on Dec 4, 2024
@ZX-ModelCloud changed the title from "Add quantize_config.meta property" to "Fix optimum compat" on Dec 5, 2024
@jiqing-feng merged commit 5979473 into jiqing-feng:gptq on Dec 5, 2024
@jiqing-feng added a commit that referenced this pull request on Dec 23, 2024
* align the gptq check with transformers to support cpu

* fix comment

* gptqmodel

Signed-off-by: jiqing-feng <[email protected]>

* compatible with auto-gptq

Signed-off-by: jiqing-feng <[email protected]>

* fix compatibility with auto-gptq

Signed-off-by: jiqing-feng <[email protected]>

* fix compatibility with the auto-gptq linear layer

Signed-off-by: jiqing-feng <[email protected]>
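
A minimal sketch of the backend dispatch these compatibility commits circle around: prefer gptqmodel (which brings the cpu support) and fall back to auto-gptq. Only is_auto_gptq_available() is confirmed by the commits; the gptqmodel probe below is an illustrative try/except, not optimum's actual code.

```python
from optimum.utils.import_utils import is_auto_gptq_available


def pick_gptq_backend() -> str:
    """Illustrative only: prefer gptqmodel, fall back to auto-gptq."""
    try:
        import gptqmodel  # noqa: F401  # availability probe, not optimum's real check
        return "gptqmodel"
    except ImportError:
        pass
    if is_auto_gptq_available():
        return "auto-gptq"
    raise ImportError("Neither gptqmodel nor auto-gptq is installed.")
```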

* revert unrelated changes

Signed-off-by: jiqing-feng <[email protected]>

* gptqmodel needs to use checkpoint_format (#1)

* need checkpoint_format

* default value of checkpoint_format is gptq

* fix quantize

* fix quantize

* fix quantize

* Update quantizer.py

* need to convert to v1 before gptqmodel save

* set checkpoint_format back to gptq after the convert (see the save-path sketch after this sub-PR's commit list)

* cleanup code

* sym=False is not supported with auto-gptq

* add comments

* cleanup code

* Update quantizer.py

* always convert v2 to v1 if checkpoint_format = "gptq"

* Update quantizer.py

---------

Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>
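
The save-path flow these checkpoint_format commits describe, as a hedged sketch: gptqmodel computes in a v2 layout, so when checkpoint_format is "gptq" (the default) the weights are converted back to v1 before saving and the config field is restored afterwards, while sym=False models stay in gptq_v2 because auto-gptq's v1 format cannot represent them. The import path and signature of hf_convert_gptq_v2_to_v1_format are assumptions; only the helper's name comes from the commits.

```python
from gptqmodel.utils import hf_convert_gptq_v2_to_v1_format  # import path assumed


def save_quantized(model, quantize_config, save_dir: str):
    # sym=False checkpoints keep checkpoint_format="gptq_v2" and skip this branch
    if quantize_config.checkpoint_format == "gptq":
        # v2 -> v1 so auto-gptq and older loaders can read the checkpoint
        model = hf_convert_gptq_v2_to_v1_format(model, quantize_config)
    model.save_pretrained(save_dir)
    quantize_config.checkpoint_format = "gptq"  # back to gptq after the convert
```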

* Mod backend code (#2)

* keep gptq_v2 if sym is false

* use hf_convert_gptq_v1_to_v2_format, hf_convert_gptq_v2_to_v1_format, and hf_gptqmodel_post_init (see the load-path sketch after this sub-PR's commit list)

* no need to check the backend

* use device_map

* cleanup

* Update quantizer.py

* move import

---------

Co-authored-by: Qubitium-ModelCloud <[email protected]>
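
The matching load path, under the same assumptions: a v1 checkpoint is converted to the v2 layout the gptqmodel kernels compute with, then post-init prepares the kernel buffers. Import paths and the exact signatures of the hf_* helpers are assumed; only their names appear in the commits.

```python
from gptqmodel.utils import (  # import path assumed
    hf_convert_gptq_v1_to_v2_format,
    hf_gptqmodel_post_init,
)


def post_load(model, quantize_config):
    if quantize_config.checkpoint_format == "gptq":
        # v1 -> v2: the layout the kernels actually compute with
        model = hf_convert_gptq_v1_to_v2_format(model, quantize_config)
    # signature assumed; auto-gptq's post_init takes a similar act-order flag
    return hf_gptqmodel_post_init(model, use_act_order=quantize_config.desc_act)
```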

* fix format and log

Signed-off-by: jiqing-feng <[email protected]>

* fix version check

Signed-off-by: jiqing-feng <[email protected]>
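
A minimal sketch of the kind of backend version floor this commit fixes; the minimum version constant is illustrative, not the real requirement.

```python
from packaging import version

import gptqmodel

MIN_GPTQMODEL = version.parse("1.4.0")  # illustrative floor, not the real minimum


def check_gptqmodel_version() -> None:
    v = version.parse(gptqmodel.__version__)
    if v < MIN_GPTQMODEL:
        raise ImportError(f"gptqmodel >= {MIN_GPTQMODEL} is required, found {v}")
```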

* enable gptqmodel tests

Signed-off-by: jiqing-feng <[email protected]>

* update check quant type

Signed-off-by: jiqing-feng <[email protected]>

* Fix optimum compat (#3)

* add meta info

* cleanup

* cleanup

* The value of quantizer should be an array

* Update quantizer.py

* If is_auto_gptq_available(), also write "auto_gptq:version" to "quantizer"

* If is_auto_gptq_available(), also write "auto_gptq:version" to "quantizer"

* Update quantizer.py

* cleanup

* comment on meta

* hf_select_quant_linear now receives checkpoint_format (see the meta and kernel-selection sketch after this sub-PR's commit list)

* add todo fix

* move convert code to quantizer.save()

* Update quantizer.py

* Optimize hf_convert_gptq_v2_to_v1_format()

* Optimize hf_convert_gptq_v1_to_v2_format()

* fix GPTQTestCUDA

* hf_select_quant_linear() always sets pack=True

* gptqmodel.hf_select_quant_linear() no longer selects ExllamaV2

* gptqmodel.hf_select_quant_linear() no longer selects ExllamaV2

* GPTQQuantizer add backend

* lowercase checkpoint_format and backend

* cleanup

* move backend to bottom

* no need to check gptqmodel version for ipex support

* Update import_utils.py

* Update quantizer.py

* fix UnboundLocalError: cannot access local variable 'version' where it is not associated with a value

* make version var short

* Update import_utils.py

* fix unittest

* use assertLessEqual

---------

Co-authored-by: Qubitium-ModelCloud <[email protected]>
Co-authored-by: LRL <[email protected]>
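
Two behaviors from this sub-PR, sketched together: the meta bookkeeping ("quantizer" is an array, and auto_gptq's version is appended when it is installed) and the kernel-selection call (checkpoint_format is passed through, pack=True is always set, the new backend field travels along, and ExllamaV2 is never chosen). The import path and any keyword names beyond those called out in the commits are assumptions.

```python
from gptqmodel.utils.importer import hf_select_quant_linear  # import path assumed
from optimum.utils.import_utils import is_auto_gptq_available


def build_meta(optimum_version: str, gptqmodel_version: str) -> dict:
    # "quantizer" is an array so several producers can be recorded
    quantizer = [f"optimum:{optimum_version}", f"gptqmodel:{gptqmodel_version}"]
    if is_auto_gptq_available():
        import auto_gptq
        quantizer.append(f"auto_gptq:{auto_gptq.__version__}")
    return {"quantizer": quantizer}


QuantLinear = hf_select_quant_linear(
    bits=4,
    group_size=128,
    desc_act=False,
    sym=True,
    checkpoint_format="gptq",  # now passed through
    backend=None,              # GPTQQuantizer's new backend field
    device_map="cpu",
    pack=True,                 # always set on this path
)
```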

* fix format and convert v2 to v1

Signed-off-by: jiqing-feng <[email protected]>

* [Fix] all tensors not on the same device (#5)

* fix device error

* update gptqmodel version

* fix test
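
A hedged illustration of the class of fix in this sub-PR: move every tensor participating in a kernel call onto one device instead of assuming they already agree. The helper name is illustrative, not the patched code.

```python
import torch


def align_devices(device: torch.device, *tensors: torch.Tensor) -> tuple:
    """Move every participating tensor to one device before a kernel call."""
    return tuple(t if t.device == device else t.to(device) for t in tensors)


# e.g. inside a quantized linear forward:
# x, scales, qzeros, g_idx = align_devices(qweight.device, x, scales, qzeros, g_idx)
```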

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* add gptqmodel tests that include cpu

Signed-off-by: jiqing-feng <[email protected]>

* fix all auto-gptq tests

Signed-off-by: jiqing-feng <[email protected]>

* revert tests

Signed-off-by: jiqing-feng <[email protected]>

* rm gptqmodel yaml

Signed-off-by: jiqing-feng <[email protected]>

* fix comment

Signed-off-by: jiqing-feng <[email protected]>

* enable real cpu tests via fp32

Signed-off-by: jiqing-feng <[email protected]>
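
What "real cpu tests via fp32" plausibly amounts to: loading the quantized model on cpu in float32 so the cpu kernels are genuinely exercised (fp16 is a gpu assumption). The checkpoint id below is a placeholder, not taken from the source.

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "<gptq-quantized-model>",   # placeholder id, not from the source
    device_map="cpu",
    torch_dtype=torch.float32,  # fp32 so the cpu path really runs
)
```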

* fix test model name

Signed-off-by: jiqing-feng <[email protected]>

* keep the original device setting when using auto-gptq

Signed-off-by: jiqing-feng <[email protected]>
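
One plausible reading of this commit, sketched: the gptqmodel path may remap to cpu when no accelerator is present, while the auto-gptq path leaves the caller's original device setting untouched. The helper is illustrative, not the patched code.

```python
import torch


def resolve_device(original: torch.device, use_gptqmodel: bool) -> torch.device:
    # gptqmodel supports cpu, so it may fall back there; auto-gptq keeps
    # whatever device the caller originally configured.
    if use_gptqmodel and not torch.cuda.is_available():
        return torch.device("cpu")
    return original
```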

* Update optimum/gptq/quantizer.py

Co-authored-by: Ilyas Moutawwakil <[email protected]>

* Update optimum/gptq/quantizer.py

Co-authored-by: Ilyas Moutawwakil <[email protected]>

---------

Signed-off-by: jiqing-feng <[email protected]>
Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: LRL <[email protected]>
Co-authored-by: Ilyas Moutawwakil <[email protected]>